Research Highlight
How Green are Neural Language Models? Analyzing Energy Consumption in Text Summarization Fine-tuning
Rehman, Tohida, Sanyal, Debarshi Kumar, Chattopadhyay, Samiran
Artificial intelligence systems significantly impact the environment, particularly in natural language processing (NLP) tasks. These tasks often require extensive computational resources to train deep neural networks, including large-scale language models containing billions of parameters. This study analyzes the trade-offs between energy consumption and performance across three neural language models: two pre-trained models (T5-base and BART-base), and one large language model (LLaMA 3-8B). These models were fine-tuned for the text summarization task, focusing on generating research paper highlights that encapsulate the core themes of each paper. A wide range of evaluation metrics, including ROUGE, METEOR, MoverScore, BERTScore, and SciBERTScore, were employed to assess their performance. Furthermore, the carbon footprint associated with fine-tuning each model was measured, offering a comprehensive assessment of their environmental impact. This research underscores the importance of incorporating environmental considerations into the design and implementation of neural language models and calls for the advancement of energy-efficient AI methodologies.
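The footprint comparison behind the study reduces to back-of-the-envelope arithmetic: energy is device power times runtime (adjusted for datacenter overhead), and emissions are energy times grid carbon intensity. A minimal sketch, assuming illustrative defaults for PUE and grid intensity that do not come from the paper:

```python
def fine_tuning_footprint_kg(gpu_watts: float, hours: float,
                             pue: float = 1.58,
                             grid_kg_per_kwh: float = 0.475) -> float:
    """Rough CO2-equivalent estimate for a fine-tuning run.

    kWh = device power * runtime, scaled by datacenter overhead (PUE);
    emissions = kWh * grid carbon intensity (kg CO2eq per kWh).
    The default PUE and intensity are illustrative global averages.
    """
    kwh = gpu_watts * hours / 1000.0 * pue
    return kwh * grid_kg_per_kwh

# e.g. a hypothetical 300 W GPU fine-tuning for 10 hours:
print(round(fine_tuning_footprint_kg(300, 10), 2))  # ~2.25 kg CO2eq
```

In practice a tracker that reads actual device power and the local grid mix (rather than fixed defaults) gives per-model numbers comparable across T5-base, BART-base, and LLaMA 3-8B runs.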
How to 3D print fully-formed robots
Printing robots that combine soft and rigid materials in a single piece has long been difficult for conventional 3D printers. To overcome this, a team has combined inkjet printing with an error-correction system guided by machine vision, allowing them to print sophisticated multi-material objects. They used this method to make a bio-inspired robotic hand that combines soft and rigid plastics to form mechanical bones, ligaments, and tendons, as well as a pump based on a mammalian heart. Elsewhere, citizen scientists help identify an astronomical object that blurs the line between asteroid and comet, and a Seinfeld episode helps scientists distinguish the brain regions involved in understanding humour from those involved in appreciating it.

Type 2 diabetes affects hundreds of millions of people around the world and represents a significant burden on healthcare systems. Behaviour-change programmes, also known as lifestyle interventions, could play a large role in preventing people from developing type 2 diabetes. This week in Nature, a new paper assesses how effective this kind of intervention might be.
Generation of Highlights from Research Papers Using Pointer-Generator Networks and SciBERT Embeddings
Rehman, Tohida, Sanyal, Debarshi Kumar, Chattopadhyay, Samiran, Bhowmick, Plaban Kumar, Das, Partha Pratim
Nowadays many research articles are prefaced with research highlights to summarize the main findings of the paper. Highlights not only help researchers precisely and quickly identify the contributions of a paper, they also enhance the discoverability of the article via search engines. We aim to automatically construct research highlights given certain segments of a research paper. We use a pointer-generator network with coverage mechanism and a contextual embedding layer at the input that encodes the input tokens into SciBERT embeddings. We test our model on a benchmark dataset, CSPubSum, and also present MixSub, a new multi-disciplinary corpus of papers for automatic research highlight generation. For both CSPubSum and MixSub, we have observed that the proposed model achieves the best performance compared to related variants and other models proposed in the literature. On the CSPubSum dataset, our model achieves the best performance when the input is only the abstract of a paper as opposed to other segments of the paper. It produces ROUGE-1, ROUGE-2 and ROUGE-L F1-scores of 38.26, 14.26 and 35.51, respectively, METEOR score of 32.62, and BERTScore F1 of 86.65 which outperform all other baselines. On the new MixSub dataset, where only the abstract is the input, our proposed model (when trained on the whole training corpus without distinguishing between the subject categories) achieves ROUGE-1, ROUGE-2 and ROUGE-L F1-scores of 31.78, 9.76 and 29.3, respectively, METEOR score of 24.00, and BERTScore F1 of 85.25.
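The ROUGE-1 F1 scores reported for CSPubSum and MixSub reduce to clipped unigram overlap between the generated and reference highlights. A minimal from-scratch sketch (illustrative, not the paper's evaluation code):

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: clipped unigram overlap between a generated
    highlight and the reference highlight (Lin, 2004)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # per-word counts clipped to min
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(rouge1_f1("we propose a pointer generator network",
                "a pointer generator network is proposed"))  # ~0.667
```

ROUGE-2 and ROUGE-L follow the same precision/recall/F1 pattern over bigrams and longest common subsequences, respectively.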
Named Entity Recognition Based Automatic Generation of Research Highlights
Rehman, Tohida, Sanyal, Debarshi Kumar, Majumder, Prasenjit, Chattopadhyay, Samiran
A scientific paper is traditionally prefaced by an abstract that summarizes the paper. Recently, research highlights that focus on the main findings of the paper have emerged as a complementary summary in addition to an abstract. However, highlights are not yet as common as abstracts, and are absent in many papers. In this paper, we aim to automatically generate research highlights using different sections of a research paper as input. We investigate whether the use of named entity recognition on the input improves the quality of the generated highlights. In particular, we have used two deep learning-based models: the first is a pointer-generator network, and the second augments the first model with a coverage mechanism. We then augment each of the above models with named entity recognition features. The proposed method can be used to produce highlights for papers with missing highlights. Our experiments show that adding named entity information improves the performance of the deep learning-based summarizers in terms of ROUGE, METEOR and BERTScore measures.
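The NER augmentation amounts to attaching an entity-type feature to each input token before it reaches the summarizer. A minimal sketch, with a hypothetical tag set and toy lexicon standing in for a trained named-entity recognizer:

```python
# Hypothetical scientific-entity tag set and lookup table; a real
# system would use a trained NER model, not a lexicon.
NER_TAGS = ["O", "METHOD", "DATASET", "METRIC"]
TOY_LEXICON = {"scibert": "METHOD", "cspubsum": "DATASET", "rouge": "METRIC"}

def ner_feature(token: str) -> list:
    """One-hot vector for the token's (toy) entity type."""
    tag = TOY_LEXICON.get(token.lower(), "O")
    return [1 if t == tag else 0 for t in NER_TAGS]

def augment(token: str, embedding: list) -> list:
    """Concatenate a token's word embedding with its NER feature,
    giving the enriched input the summarizer consumes."""
    return embedding + [float(x) for x in ner_feature(token)]

print(augment("ROUGE", [0.1, 0.2]))  # [0.1, 0.2, 0.0, 0.0, 0.0, 1.0]
```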
Research Highlights: R&R: Metric-guided Adversarial Sentence Generation - insideBIGDATA
Large language models are a hot topic in AI research right now. But there's a hotter, more significant problem looming: we might run out of data to train them on … as early as 2026. Kalyan Veeramachaneni and the team at MIT Data-to-AI Lab may have found the solution: in their paper on Rewrite and Rollback ("R&R: Metric-Guided Adversarial Sentence Generation"), just published in the Findings of AACL-IJCNLP, an R&R framework can turn low-quality data (from sources like Twitter and 4Chan) into high-quality data (text in the style of sources like Wikipedia and industry websites) by rewriting meaningful sentences, thereby expanding the pool of the right type of data for training and testing language models. Here is the peer-reviewed paper for your reference: https://aclanthology.org/2022.findings-aacl.41.pdf Kalyan Veeramachaneni is a principal research scientist at the MIT Schwarzman College of Computing.
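The rewrite-and-rollback idea can be sketched as a metric-guided accept/reject loop: propose an edit, keep it if the quality metric holds or improves, otherwise roll back. The scoring metric and substitution table below are hypothetical stand-ins, not the paper's:

```python
import random

def rewrite_and_rollback(sentence, propose, score, steps=50, seed=0):
    """Toy metric-guided editing loop in the spirit of R&R:
    propose a rewrite; keep it only if the quality metric does not
    drop, otherwise roll back to the previous best sentence."""
    rng = random.Random(seed)
    best, best_score = sentence, score(sentence)
    for _ in range(steps):
        candidate = propose(best, rng)
        s = score(candidate)
        if s >= best_score:            # accept the rewrite
            best, best_score = candidate, s
        # else: roll back (keep `best` unchanged)
    return best

# Hypothetical metric: fraction of "formal" vocabulary in the sentence.
FORMAL = {"therefore", "demonstrates", "significant"}
score = lambda s: sum(w in FORMAL for w in s.split()) / max(len(s.split()), 1)
SWAPS = {"so": "therefore", "shows": "demonstrates", "big": "significant"}
propose = lambda s, rng: " ".join(SWAPS.get(w, w) for w in s.split())

print(rewrite_and_rollback("so this shows a big effect", propose, score))
# -> therefore this demonstrates a significant effect
```

The real framework scores candidates with learned text-quality and similarity metrics rather than a word list, but the accept/rollback control flow is the same shape.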
Research Highlights: Pen and Paper Exercises in Machine Learning - insideBIGDATA
This paper consists of a collection of (mostly) pen-and-paper exercises in machine learning. The exercises are on the following topics: linear algebra, optimization, directed graphical models, undirected graphical models, expressive power of graphical models, factor graphs and message passing, inference for hidden Markov models, model-based learning (including ICA and unnormalized models), sampling and Monte-Carlo integration, and variational inference. Highly recommended for data scientists wishing to evolve their understanding of the mathematical foundations of the field.
Research Highlight: Enabling Robot Interaction With Articulated Objects
Research from Carnegie Mellon University's Robotics Institute could one day allow robots to seamlessly open drawers, doors and lids on hinges. While humans interact with various articulated objects daily -- opening a refrigerator door or lifting a toilet seat are good examples -- these tasks present a challenge in robotics. Ben Eisner and Harry Zhang, both graduate students in Assistant Professor David Held's Robots Perceiving and Doing Lab, designed a new way to train robots to perceive and manipulate articulated objects in their project, "FlowBot3D: Learning 3D Articulation Flow To Manipulate Articulated Objects." The team presented their research at Robotics: Science and Systems this year, where it was a finalist for a best paper award. FlowBot3D uses a vision-based system to help robots learn how to interact with many different kinds of articulated objects.
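For a revolute joint such as a door hinge, the articulation flow that FlowBot3D learns to predict has a simple analytic form: each surface point moves along the instantaneous velocity of rotation about the hinge axis. A sketch of that target quantity (the paper predicts it from point clouds with a neural network; this is just the geometry):

```python
import numpy as np

def articulation_flow(points, axis_origin, axis_dir):
    """Per-point motion direction for a revolute joint: each point
    moves along omega x (p - origin), the instantaneous velocity of
    rotation about the hinge axis, normalized to unit flow vectors."""
    axis_dir = axis_dir / np.linalg.norm(axis_dir)
    flow = np.cross(axis_dir, points - axis_origin)
    norms = np.linalg.norm(flow, axis=1, keepdims=True)
    return flow / np.clip(norms, 1e-9, None)

# A point on a door 0.5 m from a vertical hinge moves tangentially:
pts = np.array([[0.5, 0.0, 0.0]])
print(articulation_flow(pts, np.zeros(3), np.array([0.0, 0.0, 1.0])))
# -> [[0. 1. 0.]]
```

A prismatic joint (a drawer) is the degenerate case where every point shares the same flow direction along the sliding axis.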
Research Highlights: Interactive continual learning for robots: a neuromorphic approach - insideBIGDATA
Overview: Researchers at Intel Labs, in collaboration with the Italian Institute of Technology and the Technical University of Munich, have introduced a new approach to neural network-based object learning, specifically targeting future robotics applications such as robotic assistants that interact with unconstrained environments, in settings like logistics, healthcare, or elderly care. The researchers developed new models that successfully demonstrated continual interactive learning on Intel's neuromorphic research chip, using up to 175x lower energy to learn a new object instance, with similar or better speed and accuracy, compared to conventional methods running on a central processing unit (CPU). This research is a crucial step in improving the capabilities of future assistive or manufacturing robots, using neuromorphic computing to enable them to adapt to the unforeseen and work more naturally alongside humans. Read the full paper HERE, which was named "Best Paper" at this year's International Conference on Neuromorphic Systems (ICONS) hosted by Oak Ridge National Laboratory.
Research Highlights: Why Do Tree-based Models Still Outperform Deep Learning on Tabular Data? - insideBIGDATA
Title of paper: Why do tree-based models still outperform deep learning on tabular data? Abstract: While deep learning has enabled tremendous progress on text and image datasets, its superiority on tabular data is not clear. We contribute extensive benchmarks of standard and novel deep learning methods as well as tree-based models such as XGBoost and Random Forests, across a large number of datasets and hyperparameter combinations. We define a standard set of 45 datasets from varied domains with clear characteristics of tabular data and a benchmarking methodology accounting for both fitting models and finding good hyperparameters. Results show that tree-based models remain state-of-the-art on medium-sized data (~10K samples) even without accounting for their superior speed.
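The benchmarking setup can be illustrated in a few lines: fit a tree ensemble and a neural network on the same tabular split and compare held-out accuracy. This is a toy comparison on synthetic data, not the paper's 45-dataset protocol with hyperparameter search:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic "tabular" data standing in for a medium-sized dataset.
X, y = make_classification(n_samples=2000, n_features=20,
                           n_informative=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
mlp = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=300,
                    random_state=0).fit(X_tr, y_tr)

rf_acc = accuracy_score(y_te, rf.predict(X_te))
mlp_acc = accuracy_score(y_te, mlp.predict(X_te))
print(f"RandomForest {rf_acc:.3f}  MLP {mlp_acc:.3f}")
```

The paper's point is that under a fair budget for hyperparameter tuning, the tree-based side of this comparison tends to win on tabular data.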
Research Highlights: Using Theory of Mind to improve Human Trust in Artificial Intelligence - insideBIGDATA
Artificial Intelligence (AI) systems are threaded throughout modern society, informing decisions from low-risk interactions such as movie recommendations and chatbots to high-risk environments like medical diagnosis, self-driving cars, drones, and military operations. But it remains a significant challenge to develop human trust in these systems, particularly because the systems themselves cannot explain, in terms humans can grasp, how a recommendation or decision was reached. This lack of trust can become problematic in critical situations involving finances or healthcare, where AI decisions can have life-altering consequences. To address this issue, eXplainable Artificial Intelligence (XAI) has become an active research area both for scientists and industry. XAI develops models using explanations that aim to shed light on the underlying mechanisms of AI systems, thus bringing transparency to the process.